
Movie/Script: Alignment and Parsing of Video and Text Transcription

Internal identifier: 000C09 (Main/Exploration); previous: 000C08; next: 000C10

Authors: Timothee Cour [United States]; Chris Jordan [United States]; Eleni Miltsakaki [United States]; Ben Taskar [United States]

Source:

Lecture Notes in Computer Science [0302-9743]; 2008.

RBID : ISTEX:4D113318F9911978071D0A7B8FD0031994AF3C74

Abstract

Movies and TV are a rich source of diverse and complex video of people, objects, actions and locales “in the wild”. Harvesting automatically labeled sequences of actions from video would enable creation of large-scale and highly-varied datasets. To enable such collection, we focus on the task of recovering scene structure in movies and TV series for object tracking and action retrieval. We present a weakly supervised algorithm that uses the screenplay and closed captions to parse a movie into a hierarchy of shots and scenes. Scene boundaries in the movie are aligned with screenplay scene labels and shots are reordered into a sequence of long continuous tracks or threads which allow for more accurate tracking of people, actions and objects. Scene segmentation, alignment, and shot threading are formulated as inference in a unified generative model and a novel hierarchical dynamic programming algorithm that can handle alignment and jump-limited reorderings in linear time is presented. We present quantitative and qualitative results on movie alignment and parsing, and use the recovered structure to improve character naming and retrieval of common actions in several episodes of popular TV series.

Url:

https://api.istex.fr/document/4D113318F9911978071D0A7B8FD0031994AF3C74/fulltext/pdf

DOI: 10.1007/978-3-540-88693-8_12

Affiliations:

Links toward previous steps (curation, corpus...)

to stream Istex, to step Corpus: 001E85
to stream Istex, to step Curation: 001D55
to stream Istex, to step Checkpoint: 000676
to stream Main, to step Merge: 000C21
to stream Main, to step Curation: 000C09

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Movie/Script: Alignment and Parsing of Video and Text Transcription</title>
<author><name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
</author>
<author><name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
</author>
<author><name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
</author>
<author><name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:4D113318F9911978071D0A7B8FD0031994AF3C74</idno>
<date when="2008" year="2008">2008</date>
<idno type="doi">10.1007/978-3-540-88693-8_12</idno>
<idno type="url">https://api.istex.fr/document/4D113318F9911978071D0A7B8FD0031994AF3C74/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">001E85</idno>
<idno type="wicri:Area/Istex/Curation">001D55</idno>
<idno type="wicri:Area/Istex/Checkpoint">000676</idno>
<idno type="wicri:doubleKey">0302-9743:2008:Cour T:movie:script:alignment</idno>
<idno type="wicri:Area/Main/Merge">000C21</idno>
<idno type="wicri:Area/Main/Curation">000C09</idno>
<idno type="wicri:Area/Main/Exploration">000C09</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Movie/Script: Alignment and Parsing of Video and Text Transcription</title>
<author><name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>University of Pennsylvania, 19104, Philadelphia, PA</wicri:regionArea>
<placeName><region type="state">Pennsylvanie</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2008</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">4D113318F9911978071D0A7B8FD0031994AF3C74</idno>
<idno type="DOI">10.1007/978-3-540-88693-8_12</idno>
<idno type="ChapterID">12</idno>
<idno type="ChapterID">Chap12</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Movies and TV are a rich source of diverse and complex video of people, objects, actions and locales “in the wild”. Harvesting automatically labeled sequences of actions from video would enable creation of large-scale and highly-varied datasets. To enable such collection, we focus on the task of recovering scene structure in movies and TV series for object tracking and action retrieval. We present a weakly supervised algorithm that uses the screenplay and closed captions to parse a movie into a hierarchy of shots and scenes. Scene boundaries in the movie are aligned with screenplay scene labels and shots are reordered into a sequence of long continuous tracks or threads which allow for more accurate tracking of people, actions and objects. Scene segmentation, alignment, and shot threading are formulated as inference in a unified generative model and a novel hierarchical dynamic programming algorithm that can handle alignment and jump-limited reorderings in linear time is presented. We present quantitative and qualitative results on movie alignment and parsing, and use the recovered structure to improve character naming and retrieval of common actions in several episodes of popular TV series.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Pennsylvanie</li>
</region>
</list>
<tree><country name="États-Unis"><region name="Pennsylvanie"><name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
</region>
<name sortKey="Cour, Timothee" sort="Cour, Timothee" uniqKey="Cour T" first="Timothee" last="Cour">Timothee Cour</name>
<name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
<name sortKey="Jordan, Chris" sort="Jordan, Chris" uniqKey="Jordan C" first="Chris" last="Jordan">Chris Jordan</name>
<name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
<name sortKey="Miltsakaki, Eleni" sort="Miltsakaki, Eleni" uniqKey="Miltsakaki E" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name>
<name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
<name sortKey="Taskar, Ben" sort="Taskar, Ben" uniqKey="Taskar B" first="Ben" last="Taskar">Ben Taskar</name>
</country>
</tree>
</affiliations>
</record>
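
As a rough illustration of how the record above could be read programmatically, the following Python sketch pulls the title, author names and DOI out of a trimmed copy of it. The record as displayed uses the `wicri:` prefix without declaring it, which a strict XML parser rejects, so the sketch adds an `xmlns:wicri` declaration with a placeholder URI; that URI, the `summarize` helper and the trimming are assumptions made for this example and are not part of Dilib or Wicri.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above. The xmlns:wicri declaration and its URI
# are placeholders added for this sketch (assumption): the displayed record
# does not declare the prefix, and ElementTree refuses undeclared prefixes.
RECORD = """<record xmlns:wicri="urn:example:wicri">
<TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt>
<title xml:lang="en">Movie/Script: Alignment and Parsing of Video and Text Transcription</title>
<author><name sortKey="Cour, Timothee" first="Timothee" last="Cour">Timothee Cour</name></author>
<author><name sortKey="Jordan, Chris" first="Chris" last="Jordan">Chris Jordan</name></author>
<author><name sortKey="Miltsakaki, Eleni" first="Eleni" last="Miltsakaki">Eleni Miltsakaki</name></author>
<author><name sortKey="Taskar, Ben" first="Ben" last="Taskar">Ben Taskar</name></author>
</titleStmt>
<publicationStmt><idno type="doi">10.1007/978-3-540-88693-8_12</idno></publicationStmt>
</fileDesc></teiHeader></TEI>
</record>"""


def summarize(record_xml):
    """Return the title, author names and DOI found in the record header."""
    root = ET.fromstring(record_xml)
    # The elements themselves carry no namespace, so plain tag names work.
    title_stmt = root.find(".//titleStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "doi": root.findtext(".//publicationStmt/idno[@type='doi']"),
    }


summary = summarize(RECORD)
print(summary["title"])
print("; ".join(summary["authors"]))
print(summary["doi"])
```

The same function should work on the full record once the `wicri` prefix is bound to some namespace URI, since it only relies on the `titleStmt` and `publicationStmt` elements.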

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000C09 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000C09 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:4D113318F9911978071D0A7B8FD0031994AF3C74
   |texte=   Movie/Script: Alignment and Parsing of Video and Text Transcription
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

OCR exploration server
Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
